Functional enrichment of associated variants in epigenetic data sets

Data import & munging

The main goal of this first section is to import two datasets–a foreground and background SNP list–and convert them to GRanges objects friendly to funcivar analysis. The foreground consists of Bayesian False Discovery Probability <0.5 SNPs from the study from X regions, and the background list includes all imputed SNPs of the surrounding 1MB at those regions. These lists were obtained from Johnathan Tyrer at U. Cambridge, UK.

Let’s get started with the foreground list. We need to import the list of names and genome interval locations from a file called “bfdp05.list” and transform it into a GenomicRanges object.

fg <- read.delim(file="bfdp05.list", header = TRUE) # foreground list
fg <- mutate(fg, chrom_label=paste("chr", Chromosome, sep=""))
fg$chrom_label[fg$chrom_label=="chr23"] <- "chrX"
fg.gr <- GRanges(seqnames = fg$chrom_label,
                 ranges = IRanges(start = fg$Position,
                                  end = fg$Position),
                 strand = "*")
fg.gr$blockStarts <- start(fg.gr)
fg.gr$blockStarts <- start(fg.gr)
fg.gr$blockSizes <- end(fg.gr)-start(fg.gr)+1
## for bdfp < 0.5 SNPs "effect"=allele1, "baseline"=allele2
values(fg.gr) <- DataFrame(label=fg$SNP, linkage_group=fg$Chromosome, 
                           allele1=fg$Effect, 
                           allele2=fg$Baseline,
                           blockStarts=start(fg.gr),
                           blockSizes=end(fg.gr)-start(fg.gr))

Next we’ll do similar for the backgroung list. The background variants consist of all the variants within 1 MB flanking on either side of the primary signal for which associations were tested. This list is kept in a file called “Selected_SNP_regions.txt”. To format it for funciVAR we need to import it and recreate it as a GenomicRanges object:

bg <- read.delim(file = "Selected_SNP_regions.txt", header = TRUE) # background list
bg <- mutate(bg, chrom_label=paste("chr", Chromosome, sep = ""))
bg$chrom_label[bg$chrom_label=="chr23"] <- "chrX"
bg.gr <- GRanges(seqnames = bg$chrom_label,
                 ranges = IRanges(start = bg$Position,
                                  end = bg$Position),
                 strand = "*")
values(bg.gr) <- DataFrame(label=bg$SNP, linkage_group=bg$Chromosome, 
                           allele1 = bg$A1, 
                           allele2 = bg$A2, 
                           blockStarts=start(bg.gr),
                           blockSizes=end(bg.gr)-start(bg.gr))

background.bed <- list(seqnames = seqnames(bg.gr), starts = start(bg.gr), ends = end(bg.gr), names = mcols(bg.gr)$label)
cat("browser position chr1:157878955-158078955\ntrack name='ASIAN BKG SNPS' description='Background SNPs for Ovarian Cancer study in Asian population'\n", file = "asian_oc_background.hg19.bed")
write.table(background.bed, file = "asian_oc_background.hg19.bed", sep = '\t', row.names = FALSE, col.names = FALSE, quote = FALSE, append = TRUE)

Sanity check

Let’s check whether any of the foreground SNPs overlap background, i.e. are they a subset. Then we’ll assign them to a variants object.

overlaps.mat <- findOverlaps(fg.gr, bg.gr)
length(queryHits(overlaps.mat))
## [1] 1313
length(fg.gr)
## [1] 1283
length(bg.gr)
## [1] 1836907
unique(seqnames(bg.gr))
##  [1] chr1  chr2  chr3  chr4  chr6  chr7  chr8  chr9  chr10 chr12 chr13
## [12] chr14 chr16 chr18 chr20 chr21 chrX 
## 17 Levels: chr1 chr2 chr3 chr4 chr6 chr7 chr8 chr9 chr10 chr12 ... chrX
aoc.variants <- list(fg=fg.gr, bg=bg.gr)

From this it appears that yes, foreground is a subset of background. There are a greater number of queryHits than query rows because some variants are larger than one nucleotide and can overlap more than one row in the subject.

Annotations

The annotations are mostly in-house generated epigenomics data from prior studies, rather than the StateHub treatments of Roadmap data that we typically use. I import these using the “GetSegmentations()” functionality of funciVar.

annos <- list.files("biofeatures", full.names = TRUE)
biofeatures <- GetBioFeatures(files=annos, genome="hg19") # hang on! this will take a while

Enrichment analysis

Global enrichment

First we’ll execute the standard global enrichment analysis that funciVar does, plotting enrichment values for each set of epigenetics data on and individual basis.

enrichment <- CalculateEnrichment(aoc.variants, biofeatures, feature.type="biofeatures", CI = 0.95, return.overlaps = TRUE)
#save(aoc.variants, biofeatures, file="asiandata.Rda")
global.plot <- PlotEnrichment(enrichment$enrichment, value = "difference", color.by = NULL) + 
  ggtitle("Global Enrichment")

#ggsave(filename = "plots/global_enrichment_plot.pdf", plot = global.plot, device = "pdf", width = 18, height = 12, units = "in")
#ggsave(filename = "plots/global_enrichment_plot.png", plot = global.plot, device = "png", dpi = 300, width = 18, height = 12, units = "in")

Next we’ll create our overlap matrix for export as a supplementary table (ST7).

Locus-by-locus enrichment

Next we want to see whether individual loci are enriched for particular samples, using the same background SNPs as in the global analysis. We’ll have to define the extent of each locus and then subset the SNP lists so that we can repeat a funciVar analysis on this smaller set. The individual loci are defined in a file called “regions.txt”

locus <- with(read.delim(file="regions.txt", sep="\t"), GRanges(seqnames = seqnames, ranges = IRanges(start = start, end = end), region = region))

#define a function to create an enrichment analysis for all SNPs that overlap an interval in the locus table
l<-dim(enrichment$enrichment)[1]
w<-length(locus)
enrichment.mat <- list(probability = data.frame(matrix(NA, nrow = l, ncol = w+1)), # +1 extra column=global
                       difference  = data.frame(matrix(NA, nrow = l, ncol = w+1)))
rownames(enrichment.mat$probability) <- rownames(enrichment$enrichment)
colnames(enrichment.mat$probability) <- c("global", as.character(locus$region))
rownames(enrichment.mat$difference) <- rownames(enrichment$enrichment)
colnames(enrichment.mat$difference) <- c("global", as.character(locus$region))

enrichment.mat$probability$global <- unlist(enrichment$enrichment$probability)
enrichment.mat$difference$global <- unlist(enrichment$enrichment$difference)


for (a in 1:w) {
  print(paste(as.character(locus[a]$region)[1], "Enrichment", sep=" "))
  locus.variants <- list(fg=fg.gr[subjectHits(findOverlaps(locus[a], fg.gr))],
                         bg=bg.gr[subjectHits(findOverlaps(locus[a], bg.gr))])
  a.locus <- CalculateEnrichment(locus.variants, biofeatures, feature.type="biofeatures", CI = 0.95, return.overlaps = FALSE)
  enrichment.mat$probability[rownames(a.locus),a+1] <- a.locus$probability
  enrichment.mat$difference[rownames(a.locus),a+1] <- a.locus$difference
  a.plot <- PlotEnrichment(a.locus, value = "difference", color.by = NULL) + ggtitle(paste(locus[a]$region[1], "enrichment", sep=" "))
  ggsave(filename = paste("plots/allmarks/", locus[a]$region[1], ".pdf", sep=""), plot = a.plot, device = "pdf", width = 18, height = 12, units = "in")
  ggsave(filename = paste("plots/allmarks/", locus[a]$region[1], ".png", sep=""), plot = a.plot, device = "png", width = 18, height = 12, units = "in")
}
## [1] "1q23.1 Enrichment"
## [1] "2p25.2-25.1 Enrichment"
## Warning in SetOverlaps(variants$fg, features): No overlaps were found for
## this variant/feature combination

## [1] "2p24.2 Enrichment"
## [1] "2p16.1-p15 Enrichment"
## Warning in SetOverlaps(variants$fg, features): No overlaps were found for
## this variant/feature combination

## [1] "2q37.3 Enrichment"
## [1] "3p25.3-2 Enrichment"
## Warning in SetOverlaps(variants$fg, features): No overlaps were found for
## this variant/feature combination

## [1] "3p24.2-1 Enrichment"
## Warning in SetOverlaps(variants$fg, features): No overlaps were found for
## this variant/feature combination

## [1] "3p23-22.3 Enrichment"

## [1] "4p15.33 Enrichment"

## [1] "6p25.2-1 Enrichment"

## [1] "6p21.33-31 Enrichment"

## [1] "6q26-27 Enrichment"

## [1] "7p21.3 Enrichment"
## [1] "7q36.3 Enrichment"
## Warning in SetOverlaps(variants$fg, features): No overlaps were found for
## this variant/feature combination

## [1] "8p12 Enrichment"
## Warning in SetOverlaps(variants$fg, features): No overlaps were found for
## this variant/feature combination

## [1] "8q24.21 Enrichment"

## [1] "9q22.1-2 Enrichment"

## [1] "10q25.2 Enrichment"

## [1] "12q24.32-33 Enrichment"

## [1] "13q14.2 Enrichment"
## [1] "14q31.3-32.11 Enrichment"
## Warning in SetOverlaps(variants$fg, features): No overlaps were found for
## this variant/feature combination

## [1] "16q12.1 Enrichment"

## [1] "16q23.1-2 Enrichment"

## [1] "18q21.33-22.1 Enrichment"

## [1] "20p11.23-21 Enrichment"

## [1] "20q11.21-22 Enrichment"

## [1] "21q21.2-3 Enrichment"

## [1] "21q22.11 Enrichment"

## [1] "Xq22.3 Enrichment"

annos.meta <- data.frame(read.delim("Asian_samplenames.csv", sep = ',', row.names = 1))[rownames(enrichment.mat$probability),]
annos.meta$Global <- enrichment$enrichment$probability
annos.meta$log1pGlobal <- log1p(enrichment$enrichment$probability)
annos.meta$GlobalDiff <- enrichment$enrichment$difference


ann_colors <- list(
  Classification  = c(CCOC      = "#66C2A5",
                      EnOC      = "#FC8D62",
                      HGSOC     = "#8DA0CB",
                      MOC       = "#E78AC3",
                      OC        = "#A6D854",
                      Other     = "#FFD92F",
                      Precursor = "#E5C494"),
  Feature         = c(CTCF      = "#7FC97F",
                      H3K27ac   = "#BEAED4",
                      H3K4me1   = "#FDC086",
                      H3K4me3   = "#FFFF99",
                      NDR       = "#386CB0",
                      PAX8      = "#F0027F"),
  Tissue.CellLine = c(CL = "black", T = "white"),
  Global          = colorRampPalette(c("grey", "white", "firebrick3"),bias=0.025)(280),
  log1pGlobal     = colorRampPalette(c("grey", "white", "firebrick3"),bias=0.025)(280),
  GlobalDiff      = colorRampPalette(c("grey", "white", "firebrick3"),bias=0.025)(280)
)

pheatmap(enrichment.mat$difference[,2:w], labels_row = as.character(annos.meta$SampleName), annotation_row = annos.meta[,c("GlobalDiff", "Classification", "Feature", "Tissue.CellLine")], main = "Biofeature enrichment by locus (diff w/ bkg)", filename = "plots/locus_heat_allmarks_diff.png", res = 92, width = 14, height = 14, color = colorRampPalette(c("grey", "white", "firebrick3"),bias=2.5)(280), annotation_colors = ann_colors, clustering_method = "ward.D2")

pheatmap(log1p(enrichment.mat$probability)[,2:w], labels_row = as.character(annos.meta$SampleName), annotation_row = annos.meta[,c("log1pGlobal", "Classification", "Feature", "Tissue.CellLine")], main = "Biofeature enrichment by locus (log1p)", filename = "plots/locus_heat_allmarks_logp.png", res = 92, width = 14, height = 14, color = colorRampPalette(c("grey", "white", "firebrick3"),bias=0.025)(280), annotation_colors = ann_colors, clustering_method = "ward.D2")
row.names(enrichment.mat$probability)
##  [1] "CaOV3_FAIRE_NA.bed"                                                       
##  [2] "CaOV3_H3K27ac_NA.bed"                                                     
##  [3] "CaOV3_H3K4me1_NA.bed"                                                     
##  [4] "ClearCell_241_H3K27ac_hg19.nodup.tagAlign.naive_overlap.narrowPeak"       
##  [5] "ClearCell_511_H3K27ac_hg19.nodup.tagAlign.naive_overlap.narrowPeak"       
##  [6] "ClearCell.3172_H3K27ac_hg19.nodup.tagAlign.naive_overlap.narrowPeak"      
##  [7] "ClearCell.3547_H3K27ac_hg19.nodup.tagAlign.naive_overlap.narrowPeak"      
##  [8] "ClearCell.3588_H3K27ac_hg19.nodup.tagAlign.naive_overlap.narrowPeak"      
##  [9] "E097.OVARY.DNase.HG19.macs2.narrowPeak"                                   
## [10] "E097.OVARY.H3K27ac.HG19.narrowPeak"                                       
## [11] "E097.OVARY.H3K4me1.HG19.narrowPeak"                                       
## [12] "E097.OVARY.H3K4me3.HG19.narrowPeak"                                       
## [13] "E117.HELAS3.H3K4me1.HG19.narrowPeak"                                      
## [14] "E117.HELAS3.REMC.DNase.HG19.macs2.narrowPeak"                             
## [15] "E117.HELAS3.REMC.H3K27ac.HG19.narrowPeak"                                 
## [16] "E117.HELAS3.REMC.H3K4me3.HG19.narrowPeak"                                 
## [17] "EA_HeyA8_H3K27ac_hg19.nodup.tagAlign.naive_overlap.narrowPeak"            
## [18] "EA_IGROV1_H3K27ac_hg19.nodup.tagAlign.naive_overlap.narrowPeak"           
## [19] "ENCFF575AUU.HCT116.H3K4me3.STAM.HG19.lifted.over.from.hg38.replicated.bed"
## [20] "ENCODE_HCT116_CTCF_merged_replicates.hg19.bed"                            
## [21] "ENCODE_HCT116_H3K27ac_merged_replicates.hg19.bed"                         
## [22] "Endometrioid_291_H3K27ac_hg19.nodup.tagAlign.naive_overlap.narrowPeak"    
## [23] "Endometrioid_381_H3K27ac_hg19.nodup.tagAlign.naive_overlap.narrowPeak"    
## [24] "Endometrioid_589.Tu_H3K27ac_hg19.nodup.tagAlign.naive_overlap.narrowPeak" 
## [25] "Endometrioid_630_H3K27ac_hg19.nodup.tagAlign.naive_overlap.narrowPeak"    
## [26] "Endometrioid_703_H3K27ac_hg19.nodup.tagAlign.naive_overlap.narrowPeak"    
## [27] "GSM1663010_011_iEEC16_H3K27ac_merged_replicates_hg19.bed"                 
## [28] "GSM1663012_013_iEEC16_H3K4me1_merged_replicates.hg19.bed"                 
## [29] "GSM1663016_017_iFTSEC246_FAIRE_merged_replicates.hg19.bed"                
## [30] "GSM1663018_019_iFTSEC246_H3K27ac_merged_replicates_hg19.bed"              
## [31] "GSM1663020_021_iFTSEC246_H3K4me1_merged_replicates.hg19.bed"              
## [32] "GSM1663024_025_iFTSEC33_FAIRE_merged_replicates.hg19.bed"                 
## [33] "GSM1663026_027_iFTSEC33_H3K27ac_merged_replicates.hg19.bed"               
## [34] "GSM1663028_029_iFTSEC33_H3K4me1_merged_replicates.hg19.bed"               
## [35] "GSM1663033_034_iOSE11_FAIRE_merged_replicates.hg19_rCRSchrm.bed"          
## [36] "GSM1663035_036_iOSE11_H3K27ac_merged_replicates.hg19.bed"                 
## [37] "GSM1663037_038_iOSE11_H3K4me1_merged_replicates.hg19.bed"                 
## [38] "GSM1663041_042_043_044_iOSE4_FAIRE_merged_replicates.hg19.bed"            
## [39] "GSM1663045_046_iOSE4_H3K27ac_merged_replicates.hg19.bed"                  
## [40] "GSM1663047_048_iOSE4_H3K4me1_merged_replicates.hg19.bed"                  
## [41] "GSM2107940_FT246_PAX8_peaks.clean.hg19.bed"                               
## [42] "GSM2107942_Kuromachi_PAX8_peaks.clean.hg19.bed"                           
## [43] "GSM2107945_OVSAHO_PAX8_peaks.clean.hg19.bed"                              
## [44] "GSM2107947_JHOS4_PAX8_peaks.clean.hg19.bed"                               
## [45] "GTFR230.CTCF.hg19.narrowpeak"                                             
## [46] "HeyA8_PAX8.conservative.IDR0.05.filt.narrowPeak"                          
## [47] "HG_Serous_550_H3K27ac_hg19.nodup.tagAlign.naive_overlap.narrowPeak"       
## [48] "HGSerous_229_H3K27ac_hg19.nodup.tagAlign.naive_overlap.narrowPeak"        
## [49] "HGSerous_270_H3K27ac_hg19.nodup.tagAlign.naive_overlap.narrowPeak"        
## [50] "HGSerous_429_H3K27ac_hg19.nodup.tagAlign.naive_overlap.narrowPeak"        
## [51] "HGSerous_561_H3K27ac_hg19.nodup.tagAlign.naive_overlap.narrowPeak"        
## [52] "IGROV_PAX8.conservative.IDR0.05.filt.narrowPeak"                          
## [53] "JHOC.CTCF.hg19.narrowpeak"                                                
## [54] "MCAS.CTCF.hg19.narrowpeak"                                                
## [55] "Mucinous_230_H3K27ac_hg19.nodup.tagAlign.naive_overlap.narrowPeak"        
## [56] "Mucinous_380_H3K27ac_hg19.nodup.tagAlign.naive_overlap.narrowPeak"        
## [57] "Mucinous_464_H3K27ac_hg19.nodup.tagAlign.naive_overlap.narrowPeak"        
## [58] "Mucinous_652_H3K27ac_hg19.nodup.tagAlign.naive_overlap.narrowPeak"        
## [59] "Mucinous.3724_H3K27ac_hg19.nodup.tagAlign.naive_overlap.narrowPeak"       
## [60] "NR_Kuramochi_H3K27ac_hg19.nodup.tagAlign.naive_overlap.narrowPeak"        
## [61] "OAW.CTCF.hg19.narrowpeak"                                                 
## [62] "RMG.CTCF.hg19.narrowpeak"                                                 
## [63] "SG_GTFR230_H3K27ac_hg19.nodup.tagAlign.naive_overlap.narrowPeak"          
## [64] "SG_JHOC_H3K27ac_hg19.nodup.tagAlign.naive_overlap.narrowPeak"             
## [65] "SG_MCAS_H3K27ac_hg19.nodup.tagAlign.naive_overlap.narrowPeak"             
## [66] "SG_OAW_H3K27ac_hg19.nodup.tagAlign.naive_overlap.narrowPeak"              
## [67] "SG_RMG_H3K27ac_hg19.nodup.tagAlign.naive_overlap.narrowPeak"              
## [68] "UWB1.289_FAIRE_NA.bed"                                                    
## [69] "UWB1.289_H3K27ac_NA.bed"                                                  
## [70] "UWB1.289_H3K4me1_NA.bed"                                                  
## [71] "GSM1663008_009_iEEC16_FAIRE_merged_replicates.hg19.bed"                   
## [72] "GSM2107936_FT33_PAX8_peaks.clean.hg19.bed"                                
## [73] "GSM2107938_FT194_PAX8_peaks.clean.hg19.bed"

HM-specific analyses

Histone-specific enrichment for H3K27ac

h3k4me1  <- c(3, 11, 13, 28, 31, 34, 37, 40, 70)
h3k27ac  <- c(2, 4:8, 10, 15, 17, 18, 21:27, 33, 36, 39, 47:51, 55:60, 63:67, 69)
h3k4me3  <- c(12, 16, 19)
dhsfaire <- c(1, 9, 14, 29, 32, 35, 38, 68, 71)
ctcf     <- c(20, 45, 53, 54, 61, 62)
pax8     <- c(41:44, 46, 52, 72, 73)
pax8     <- c(41:44, 46, 52, 72, 73)


levels(annos.meta$Classification)
## [1] "CCOC"      "EnOC"      "HGSOC"     "MOC"       "OC"        "Other"    
## [7] "Precursor"
pheatmap(log1p(enrichment.mat$probability[h3k27ac,2:w]), labels_row = as.character(annos.meta$SampleName[h3k27ac]), annotation_row = annos.meta[h3k27ac, c("log1pGlobal", "Classification", "Tissue.CellLine")], main = "H3K27Ac enrichment", color = colorRampPalette(c("grey", "white", "firebrick3"), bias=0.025)(280), annotation_colors = ann_colors, clustering_method = "ward.D2")

pheatmap(log1p(enrichment.mat$probability[h3k27ac,2:w]), labels_row = as.character(annos.meta$SampleName[h3k27ac]), annotation_row = annos.meta[h3k27ac, c("log1pGlobal", "Classification", "Tissue.CellLine")], filename = "plots/locus_heat_h3k27ac.png", res = 92, width = 14, height = 10, color = colorRampPalette(c("grey", "white", "firebrick3"), bias=0.025)(280), main = "H3K27Ac enrichment", annotation_colors = ann_colors, clustering_method = "ward.D2")

pheatmap(log1p(enrichment.mat$probability[h3k27ac,2:w]), labels_row = as.character(annos.meta$SampleName[h3k27ac]), annotation_row = annos.meta[h3k27ac, c("log1pGlobal", "Classification", "Tissue.CellLine")], filename = "plots/locus_heat_h3k27ac.pdf", width = 14, height = 10, color = colorRampPalette(c("grey", "white", "firebrick3"), bias=0.025)(280), main = "H3K27Ac enrichment", annotation_colors = ann_colors, clustering_method = "ward.D2")

pheatmap(log1p(enrichment.mat$probability[h3k4me1,2:w]), labels_row = as.character(annos.meta$SampleName[h3k4me1]), annotation_row = annos.meta[h3k4me1, c("log1pGlobal", "Classification", "Tissue.CellLine")], color = colorRampPalette(c("grey", "white", "firebrick3"), bias=0.025)(280), main = "H3K4me1 enrichment", annotation_colors = ann_colors, clustering_method = "ward.D2")

pheatmap(log1p(enrichment.mat$probability[h3k4me1,2:w]), labels_row = as.character(annos.meta$SampleName[h3k4me1]), annotation_row = annos.meta[h3k4me1, c("log1pGlobal", "Classification", "Tissue.CellLine")], filename = "plots/locus_heat_h3k4me1.png", res = 92, width = 14, height = 4, color = colorRampPalette(c("grey", "white", "firebrick3"), bias=0.025)(280), main = "H3K4me1 enrichment", annotation_colors = ann_colors, clustering_method = "ward.D2")

pheatmap(log1p(enrichment.mat$probability[dhsfaire,2:w]), labels_row = as.character(annos.meta$SampleName[dhsfaire]), annotation_row = annos.meta[dhsfaire, c("log1pGlobal", "Classification", "Tissue.CellLine")], color = colorRampPalette(c("grey", "white", "firebrick3"), bias=0.025)(280), main = "DHS and FAIRE enrichment", annotation_colors = ann_colors, clustering_method = "ward.D2")

pheatmap(log1p(enrichment.mat$probability[dhsfaire,2:w]), labels_row = as.character(annos.meta$SampleName[dhsfaire]), annotation_row = annos.meta[dhsfaire, c("log1pGlobal", "Classification", "Tissue.CellLine")], filename = "plots/locus_heat_dhsfaire.png", res = 92, width = 14, height = 4, color = colorRampPalette(c("grey", "white", "firebrick3"), bias=0.025)(280), main = "DHS and FAIRE enrichment", annotation_colors = ann_colors, clustering_method = "ward.D2")

#pheatmap(log1p(enrichment.mat$probability[pax8,2:w]), labels_row = as.character(annos.meta$SampleName[pax8]), annotation_row = annos.meta[pax8, c("log1pGlobal", "Classification", "Tissue.CellLine")], color = colorRampPalette(c("grey", "white", "firebrick3"), bias=0.025)(280), main = "PAX8 ChIP-seq enrichment", annotation_colors = ann_colors, clustering_method = "ward.D2")

#pheatmap(log1p(enrichment.mat$probability[pax8,2:w]), labels_row = as.character(annos.meta$SampleName[pax8]), annotation_row = annos.meta[pax8, c("log1pGlobal", "Classification", "Feature", "Tissue.CellLine")], filename = "plots/locus_heat_pax8.png", res = 92, width = 14, height = 10, color = colorRampPalette(c("grey", "white", "firebrick3"), bias=0.025)(280), main = "Pax8 ChIP-seq enrichment", annotation_colors = ann_colors, clustering_method = "ward.D2")

pheatmap(log1p(enrichment.mat$probability[ctcf,2:w]), labels_row = as.character(annos.meta$SampleName[ctcf]), annotation_row = annos.meta[ctcf, c("log1pGlobal", "Classification", "Tissue.CellLine")], color = colorRampPalette(c("grey", "white", "firebrick3"), bias=0.025)(280), main = "CTCF ChIP-seq enrichment", annotation_colors = ann_colors, clustering_method = "ward.D2")

pheatmap(log1p(enrichment.mat$probability[ctcf,2:w]), labels_row = as.character(annos.meta$SampleName[ctcf]), annotation_row = annos.meta[ctcf, c("log1pGlobal", "Classification", "Tissue.CellLine")], color = colorRampPalette(c("grey", "white", "firebrick3"), bias=0.025)(280), filename = "plots/locus_heat_ctcf.png", res = 92, width = 14, height = 3.4, annotation_colors = ann_colors, clustering_method = "ward.D2")

pheatmap(log1p(enrichment.mat$probability[h3k4me3,2:w]), labels_row = as.character(annos.meta$SampleName[h3k4me3]), annotation_row = annos.meta[h3k4me3, c("log1pGlobal", "Classification", "Tissue.CellLine")], color = colorRampPalette(c("grey", "white", "firebrick3"), bias=0.025)(280), main = "H3K4me3 enrichment", annotation_colors = ann_colors, clustering_method = "ward.D2")

pheatmap(log1p(enrichment.mat$probability[h3k4me3,2:w]), labels_row = as.character(annos.meta$SampleName[h3k4me3]), annotation_row = annos.meta[h3k4me3, c("log1pGlobal", "Classification", "Tissue.CellLine")], color = colorRampPalette(c("grey", "white", "firebrick3"), bias=0.025)(280), filename = "plots/locus_heat_h3k4me3.png", res = 92, width = 14, height = 2.5, annotation_colors = ann_colors, clustering_method = "ward.D2")

Motif Disruption

Filter SNPs to single-nucleotide variants overlapping epigenetic features.